27 research outputs found

    FRASIMED: a Clinical French Annotated Resource Produced through Crosslingual BERT-Based Annotation Projection

    Natural language processing (NLP) applications such as named entity recognition (NER) for low-resource corpora do not benefit from recent advances in large language models (LLMs), because they still require larger annotated datasets. This research article introduces a methodology for generating translated versions of annotated datasets through crosslingual annotation projection. Leveraging a language-agnostic BERT-based approach, it offers an efficient way to enlarge low-resource corpora with little human effort, using only open data resources that are already available. Quantitative and qualitative evaluations of semi-automatic data generation strategies are often lacking. The evaluation of our crosslingual annotation projection approach showed both effectiveness and high accuracy in the resulting dataset. As a practical application of this methodology, we present the creation of the French Annotated Resource with Semantic Information for Medical Entities Detection (FRASIMED), an annotated corpus comprising 2,051 synthetic clinical cases in French. The corpus is now available for researchers and practitioners to develop and refine French NLP applications in the clinical field (https://zenodo.org/record/8355629), making it the largest open annotated corpus with linked medical concepts in French.

    Applying the FAIR4Health Solution to Identify Multimorbidity Patterns and Their Association with Mortality through a Frequent Pattern Growth Association Algorithm

    This article belongs to the Special Issue "Addressing the Growing Burden of Chronic Diseases and Multimorbidity: Characterization and Interventions". The current availability of electronic health records represents an excellent opportunity for research on multimorbidity, one of the most relevant public health problems today. However, it also poses a methodological challenge because tools to access, harmonize and reuse research datasets are currently lacking. In FAIR4Health, a European Horizon 2020 project, a workflow to implement the FAIR (findability, accessibility, interoperability and reusability) principles on health datasets was developed, along with two tools aimed at facilitating the transformation of raw datasets into FAIR ones and the preservation of data privacy. As part of this project, we conducted a multicentric retrospective observational study to apply this FAIR implementation workflow and these tools to five European health datasets for research on multimorbidity. We applied a federated frequent pattern growth association algorithm to identify the most frequent combinations of chronic diseases and their association with mortality risk. We identified several multimorbidity patterns that were clinically plausible and consistent with the literature, some of which were strongly associated with mortality. Our results show the usefulness of the solution developed in FAIR4Health to overcome the difficulties in data management and highlight the importance of implementing a FAIR data policy to accelerate responsible health research.
    This study was performed in the framework of FAIR4Health, a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 824666. This research has also been co-supported by the Carlos III National Institute of Health, through the IMPaCT Data project (code IMP/00019), and through the Platform for Dynamization and Innovation of the Spanish National Health System industrial capacities and their effective transfer to the productive sector (code PT20/00088), both co-funded by the European Regional Development Fund (FEDER) ‘A way of making Europe’, and by REDISSEC (RD16/0001/0005) and RICAPPS (RD21/0016/0019) from the Carlos III National Institute of Health. This work was also supported by Instituto de Investigación Sanitaria Aragón and the Carlos III National Institute of Health [Río Hortega Program, grant number CM19/00164]. Peer reviewed.

    FAIR4Health: Findable, Accessible, Interoperable and Reusable data to foster Health Research

    Due to the nature of health data, its sharing and reuse for research are limited by ethical, legal and technical barriers. The FAIR4Health project facilitated and promoted the application of FAIR principles to health research data derived from publicly funded health research initiatives, to make them Findable, Accessible, Interoperable, and Reusable (FAIR). To confirm the feasibility of the FAIR4Health solution, we performed two pathfinder case studies that ran federated machine learning algorithms on FAIRified datasets from five health research organizations. The case studies demonstrated the potential impact of the developed FAIR4Health solution on health outcomes and social care research. Finally, we promoted the sharing and reuse of the FAIRified data in the European Union health research community, defining an effective EU-wide strategy for the use of FAIR principles in health research and preparing the ground for a roadmap for health research institutions. This scientific report presents a general overview of the FAIR4Health solution, from the design of the FAIRification workflow, which translates raw data/metadata into FAIR data/metadata in the health research domain, to the performance of the FAIR4Health demonstrators.
    This research was financially supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 824666 (project FAIR4Health). This research has also been co-supported by the Carlos III National Institute of Health, through the IMPaCT Data project (code IMP/00019), and through the Platform for Dynamization and Innovation of the Spanish National Health System industrial capacities and their effective transfer to the productive sector (code PT20/00088), both co-funded by the European Regional Development Fund (FEDER) ‘A way of making Europe’. Peer reviewed.

    Christophe Gaudet-Blavignac, researcher at the Department of Radiology and Medical Informatics of the Faculty of Medicine of the UNIGE.


    Semantic interoperability of clinical data: a multi-dimensional approach

    With the progress of healthcare digitalization and the growing production of data, interoperability has become a major hurdle in all communities and for all usages, be it care, administrative support, research or public health. Despite a profusion of standards, interoperability remains an unresolved challenge. This thesis presents a multi-dimensional approach to semantic interoperability for various types of data and multiple use cases. The proposed solution first relies on strong semantics, using compositional controlled vocabularies to create a computer-readable interlingua without enforcing a data model. It then restricts the representation complexity to a useful set of concepts encountered in practice. Finally, it exploits the compositional capabilities of SNOMED CT to represent complex narrative data as post-coordinated SNOMED CT sentences. This approach defines a new, semantically interoperable landscape for clinical data that can take advantage of the opportunities offered by the growth of personalized medicine.

    Geneva University Hospitals Common Problem List

    This work is licensed under CC BY-SA 4.0.
    LPH - Problem list. Version: 20220401.
    This archive contains the problem list developed by the Division of Medical Information Sciences (SIMED) of the Geneva University Hospitals (HUG). This list has been manually developed and encoded in various international standards. Any use of this work must stay within the limits of the CC BY-SA 4.0 licence.
    Citing this work: to cite this work in a publication, please cite the following article: Gaudet-Blavignac C, Rudaz A, Lovis C. Building a Shared, Scalable, and Sustainable Source for the Problem-Oriented Medical Record: Developmental Study. JMIR Med Inform 2021;9(10):e29174. URL: https://medinform.jmir.org/2021/10/e29174 DOI: 10.2196/2917

    Serious games in health care: a survey

    This survey presents an overview of existing serious games in healthcare designed for patients, and of the evaluation of their effects. Such games, which aim to help patients better understand their condition or treatment, to foster healthy behaviors, or even to participate in therapies, are expected to grow in parallel with the importance of the videogaming industry.

    Building a Shared, Scalable, and Sustainable Source for the Problem-Oriented Medical Record: Developmental Study

    Background: Since the creation of the problem-oriented medical record, the building of problem lists has been the focus of many studies. To date, this issue is not well resolved, and building an appropriate contextualized problem list is still a challenge.
    Objective: This paper aims to present the process of building a shared multipurpose common problem list at the Geneva University Hospitals. This list aims to bridge the gap between clinicians’ language expressed in free text and secondary uses requiring structured information.
    Methods: We focused on the needs of clinicians by building a list of uniquely identified expressions to support their daily activities. In the second stage, these expressions were connected to additional information to build a complex graph of information. A list of 45,946 expressions manually extracted from clinical documents was manually curated and encoded in multiple semantic dimensions, such as International Classification of Diseases, 10th revision; International Classification of Primary Care, 2nd edition; Systematized Nomenclature of Medicine Clinical Terms; or dimensions dictated by specific usages, such as identifying expressions specific to a domain, a gender, or an intervention. The list was progressively deployed for clinicians with an iterative process of quality control, maintenance, and improvements, including the addition of new expressions or dimensions for specific needs. The problem management of the electronic health record allowed the measurement and correction of encoding based on real-world use.
    Results: The list was deployed in production in January 2017 and was regularly updated and deployed in new divisions of the hospital. Over 4 years, 684,102 problems were created using the list. The proportion of free-text entries decreased progressively from 37.47% (8321/22,206) in December 2017 to 18.38% (4547/24,738) in December 2020. In the last version of the list, over 14 dimensions were mapped to expressions, among which 5 were international classifications and 8 were other classifications for specific uses. The list became a central axis in the electronic health record, being used for many different purposes linked to care, such as surgical planning or emergency wards, or in research, for various predictions using machine learning techniques.
    Conclusions: This study breaks with common approaches primarily by focusing on real clinicians’ language when expressing patients’ problems and secondarily by mapping whatever is required, including controlled vocabularies to answer specific needs. This approach improves the quality of the expression of patients’ problems while allowing the building of as many structured dimensions as needed to convey semantics according to specific contexts. The method is shown to be scalable, sustainable, and efficient at hiding the complexity of semantics or the burden of constraint-structured problem list entry for clinicians. Ongoing work is analyzing the impact of this approach on how clinicians express patients’ problems.

    Use of the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) for Processing Free Text in Health Care: Systematic Scoping Review

    Interoperability and secondary use of data are challenges in health care. Specifically, the reuse of clinical free text remains an unresolved problem. The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) has become the universal language of health care and presents characteristics of a natural language. Its use to represent clinical free text could constitute a solution to improve interoperability.

    Clinical Data Models at University Hospitals of Geneva

    To reuse data for clinical research, two main challenges must be overcome: formalizing the data sources and increasing their portability. Once these challenges are resolved, research applications will be able to reuse clinical data. This paper describes three data models: entity-attribute-value, ontological and data-driven. It then explains their implementation at the University Hospitals of Geneva (HUG) in the data integration methodologies for operational healthcare data sources of the European projects DebugIT and EHR4CR and of the national project, the Swiss Transplant Cohort Study. In these methodologies, the clinical data are either aligned to standardised terminologies using different processing techniques or transformed and loaded directly into data models. The models are then compared and discussed on the basis of quality criteria. The comparison shows that the described data models are strongly dependent on the objectives of the projects.